Abstract. In recent years, social media has played an increasingly important role in reporting 
world events. The publication of crowd-sourced photographs and videos in near real-time is one of 
the reasons behind the high impact. However, the use of a camera can draw the photographer into 
a situation of conflict. Examples include the use of cameras by regulators collecting evidence of 
Mafia operations; citizens collecting evidence of corruption at a public service outlet; and political 
dissidents protesting at public rallies. In all these cases, the published images contain fairly unambiguous 
clues about the location of the photographer (scene viewpoint information). In the presence 
of adversary operated cameras, it can be easy to identify the photographer by also combining 
leaked information from the photographs themselves. We call this the camera location detection 
attack. We propose and review defense techniques against such attacks. Defenses such as image 
obfuscation techniques do not protect camera-location information; current anonymous publication 
technologies do not help either. However, the use of view synthesis algorithms could be a promising 
step in the direction of providing probabilistic privacy guarantees. 

1 Introduction 

Cameras are becoming ubiquitous. In the near future almost everyone will carry a personal high 
quality camera included in a mobile computing device. Furthermore, surveillance cameras are also 
increasing in spaces frequented by the general public. In the last few years, the quality of such 
cameras has significantly increased with an increasing number of deployments on the streets. In the 
following paragraphs we highlight novel threats to privacy when adversary operated cameras point 
at other camera users. 

In a world where cameras are ubiquitous the privacy of individuals is constantly at stake. The 
release of pictorial information both careful and careless has historically been considered a potential 
privacy risk based on the leakage of information contained in the picture itself (private information 
contained in the captured scene). In this context, ad-hoc obfuscation methods such as 
blurring, pixelation and inpainting algorithms (T) have been proposed. Subsequent work offered 
privacy metrics and proposed better solutions on images Q and videos Q]. 

Little attention has been paid to a possibly more important threat in the context of camera ubiquity: 
the privacy of the photographer taking an image (privacy of location from which the shots have 
been taken). Indeed, the average camera person expects to remain anonymous when publishing 
an image via anonymous channels on an anonymous publishing system. However, this is not the 
case, as images leak potentially private information that can lead to the de-anonymization of the 
photographer. 

As a motivating example consider a secret meeting where the Mafia is announcing new rates for 
getting access to various public services. The Mafia's security keep a tight eye on the gathering to 
prevent any overt photography. They set up CCTVs to cover the venue but do not actively monitor 
the footage. A few journalists in the guise of businessmen are carrying covert cameras to record the 
event and promptly publish their images. When the embarrassing pictures turn up on the Internet, 
the Mafia uses the published images to locate the set of possible camera locations and then leverages 
CCTV input to identify the photographer (or at least to significantly reduce the anonymity set). 



This is a real threat faced by increasing numbers of members of the general public. During the 
Burmese emergency of 2007 [11], protestors taking photographs at explosive anti-government rallies 
were identified and hunted down by crews of government secret police collecting footage of 
protestors in attendance using mobile video cameras in a bid to censor 'sensitive' footage. Similar 
video intelligence teams have since been funded by several state and non-state actors. We foresee 
the usage of hostile cameras along with camera location deanonymization as a serious threat to 
parties engaging in activities involving taking pictures in hostile environments. 

Our goal is to develop methods and techniques to address leakage of camera location information. 
Essentially, we hope to investigate novel techniques of image processing that seek to maximize the 
privacy of the camera user or, to use a term from anonymous communications parlance, to maximize 
the unlinkability between an image and the photographer. 

2 Problem Description, Threat Model and Solution Approach 

In this paper, we wish to discuss solutions to the problem of photographer identification or 
photographer deanonymization, i.e., given a published photograph, we wish to keep the photographer 
anonymous. In particular, we focus on maximizing camera-location anonymity. 

Camera location anonymity problem: Consider a set of people $P$ within a scene, where $k < |P|$ 
people are recording (not necessarily from the same location) and publishing images to the Internet. 
We wish to maximize the unlinkability between a published image (output) and the location from 
which it was captured (input). Anonymity is achieved by minimizing the correlation between raw 
input and published images. This is similar to the function of a mix in anonymous communications, 
where the correlation between input and output traffic streams is minimized. 

Attack: Given a published image, an attacker can infer the set of potential locations from which 
an image could have been captured by examining the scene viewpoint. This is because a given 
camera at a scene will record the image of the scene from a certain viewpoint. In this way, the 
recorded image will reveal some information about the location of the camera. An adversary can 
mount a camera location detection attack by combining information from a leaked (or published) 
image and footage from adversary cameras operating within the scene. 

The image may also reveal information specific to the camera itself such as aberrations in the lens 
or the CCD which can be used to specifically identify the camera used in capturing a scene [2, 8, 6]. 
Such 'side-channel' information can subsequently be used to identify the photographer. However, 
we do not seek to defend against such attacks in this work. 

Threat model: Our threat model is that of a global passive adversary who deploys surveillance 
probes (video cameras). However, we assume that the adversary does not have the resources to 
analyze camera footage in real-time but rather that all video analysis is post-event. 

Approach: Our approach towards maximizing camera-location anonymity is the following. Instead of 
publishing an image downloaded from the camera's memory card, the photographer records multiple 
images (or frames) of the scene of interest from different viewpoints. She then generates a new 
image from a randomly chosen viewpoint using the input images and publishes the result. Since 
the scene viewpoint could potentially point anywhere within the physical space of interest within 
limits, the photographer's anonymity set can hopefully be extended to the entire set of people present 
in that space. 

A view synthesis algorithm takes two images and an input viewpoint and generates a 
synthesized image from that viewpoint using the input images. There is a large body of literature 
on this in the area of computer vision. View synthesis depends on a stereo correspondence algorithm 
applied on the input images followed by forward or inverse warping steps and a hole-handling 
step [9]. Each of these steps corresponds to a sub-algorithm. For a detailed review of view synthesis 
the reader is referred to the work of Scharstein and Szeliski [10]. 
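
To make the warping and hole-handling steps concrete, the following is a minimal sketch of forward warping on a single image row. It is an illustration only, not the algorithm of any cited work: the function name `forward_warp_row`, the parameter `shift_scale`, and the rule that the pixel with the larger disparity magnitude wins a collision are all assumptions made here.

```python
def forward_warp_row(row, disparity, shift_scale):
    """Forward-warp one image row and report which target pixels stay empty.

    row         : list of pixel intensities from one input image
    disparity   : list of per-pixel horizontal disparities (same length as row)
    shift_scale : hypothetical parameter; fraction of the disparity that is
                  applied (0 keeps the input view unchanged)
    """
    width = len(row)
    warped = [None] * width   # None marks a target pixel nobody wrote to
    best = [-1.0] * width     # disparity magnitude already written per target
    for u in range(width):
        target = int(round(u + shift_scale * disparity[u]))
        # Assumption: a larger disparity magnitude means a nearer surface,
        # so it overwrites whatever was written to the same target before.
        if 0 <= target < width and abs(disparity[u]) > best[target]:
            warped[target] = row[u]
            best[target] = abs(disparity[u])
    holes = [value is None for value in warped]
    return warped, holes


# A two-pixel foreground object (disparity 2) in front of a flat background.
row = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
disparity = [0, 0, 0, 2, 2, 0, 0, 0]
warped, holes = forward_warp_row(row, disparity, 1.0)
```

In this toy row the object moves two pixels to the right and the two positions it vacates receive no data. These unfilled positions are the kind of hole whose location the privacy analysis below is concerned with.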



3 Privacy Analysis 



Let $I_L$ and $I_R$ be two images of the same scene taken from two different viewpoints, left and 
right respectively, corresponding to the two camera positions $L = (x_L, y_L, z_L)$ and 
$R = (x_R, y_R, z_R)$. The objective of a view synthesis algorithm is to compute a synthesized image 
$I_S$ at a new viewpoint or camera position $S = (x_S, y_S, z_S)$. This new image aims at protecting 
the anonymity of the photographers. In what follows, we investigate to what extent this anonymity 
can be preserved. Concretely, given a synthesized image $I_S$, how much can be inferred about the 
photographers' positions $L$ and $R$? 

To answer this question, we start by identifying at which level privacy can be leaked in a classical 
stereo-based view synthesis algorithm. Such an algorithm starts by establishing a stereo correspondence 
between $I_L$ and $I_R$. This means estimating a disparity map $D$ that relates the 
pixel coordinates $p_L = (u_L, v_L)$ on $I_L$ to the pixel coordinates $p_R = (u_R, v_R)$ on $I_R$, such that 

$$(u_R, v_R) = (u_L + D(u_L, v_L),\ v_L). \tag{1}$$

For a realistic rendering of the synthesized image $I_S$, it is necessary to find a dense disparity map 
that relates all pairs of points $(p_L, p_R)$. To that end, it is necessary to find the optimal match 
between $I_L$ and $I_R$. If this correspondence is perfectly achieved, the only remaining errors are 
due to geometrical constraints such as occlusions. These errors are contained in $D$, and often 
appear on $I_S$ as holes. The location of these holes may reveal much about the relative geometry 
of the scene as seen from $L$ and $R$, which in turn reveals $L$ and $R$ if the scene geometry is 
known or can be discovered. 

Once $D$ is estimated, the image $I_S$ can be computed by interpolation or extrapolation, depending 
on whether $x_S$ is chosen to be inside or outside the interval $[x_L, x_R]$. Hence, at a given pixel location 
$(u, v)$, and for a baseline $b = (x_R - x_L)$, $I_S$ may be written as: 

$$I_S(u, v) = \left(1 - \frac{x_S - x_L}{b}\right) \cdot I_L(u + D(u, v), v) + \frac{x_S - x_L}{b} \cdot I_R(u, v). \tag{2}$$
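
The distinction between interpolation and extrapolation can be illustrated with a short sketch. It assumes that the blending weight is the position of the synthesized viewpoint along the baseline, normalized by $b$. This normalization and the function names below are illustrative assumptions, not a definitive statement of the algorithm.

```python
def blend_weight(x_s, x_l, x_r):
    """Position of the new viewpoint along the baseline, normalized by b."""
    baseline = x_r - x_l
    if baseline <= 0:
        raise ValueError("expected x_l < x_r")
    return (x_s - x_l) / baseline


def synthesis_mode(x_s, x_l, x_r):
    """Interpolation if the new viewpoint lies in [x_l, x_r], else extrapolation."""
    return "interpolation" if x_l <= x_s <= x_r else "extrapolation"


def blend_pixel(left_value, right_value, weight):
    """Weighted combination of two already-aligned pixel values."""
    return (1.0 - weight) * left_value + weight * right_value


# Viewpoint halfway between the cameras, then one half-baseline beyond R.
inside = blend_pixel(10.0, 20.0, blend_weight(0.5, 0.0, 1.0))
outside = blend_pixel(10.0, 20.0, blend_weight(1.5, 0.0, 1.0))
```

For a viewpoint between the cameras the weight stays in $[0, 1]$ and the result lies between the two inputs. Outside the interval the weight leaves that range and the result is an extrapolation beyond both inputs.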

In what follows, we investigate the nature and the extent of the occlusions that cause privacy leakage 
for the two distinctive cases of extrapolation and interpolation. 

3.1 Privacy Leakage on Extrapolated Records 

The observation this section is based upon is that independently of the object recorded there may 
be occlusion beside the object on the extrapolated record. The reason for this occlusion is that, 
in specific setups, neither of the two original cameras can record what is behind the object as 
the object itself occludes that part of the scene. Now, if one is trying to establish an extrapolated 
record based on such two original records, then the final record will contain the projection of the 
occluded scene part. The projection of the occluded scene part is called a hole. The existence of holes 
is independent of the effectiveness of the stereo correspondence subroutine of the view synthesis 
algorithm. Even in case of perfect matching, the occluded scene parts cannot be mitigated from the 
synthesized record. Of course, view synthesis algorithms incorporate a hole filling step, but this is 
usually based on some kind of interpolation between the hole surroundings, and as such, filled holes 
are easily observable and measurable by humans (see, for example, Fig. 1 in [?]). In the following, 
we assume that the observer is given an extrapolated picture containing the holes. The observer (or 
big brother) is trying to recover the position of the original cameras, thus trying to de-anonymize 
the photographers using the side-channel information given by the position and size of the holes. 
In fact, there are two slightly different setups when holes appear on the synthesised picture. These 
setups are characterized by the 

- Position of the left camera $L$, position of the right camera $R$ and the distance $b$ between them. 

- Position $(x_S, y_S, z_S)$ of the synthesized viewpoint $S$, and its distance $s$ from $R$. 

- Object size $l$ and its position, particularly its distance $Z_o$ from the focal plane and the size $l'$ of its 
projection on the image plane. 

- Distance $Z_b$ of the background from the cameras, and the focal plane. 

- Position of the planes (i.e., focal plane, image plane), particularly the focal length $f$. 

- Hole size $h$ on the image plane. 

Footnote (disparity map $D$): For simplicity, we assume a disparity change on the $x$ axis only. 

Fig. 1(a) and Fig. 1(b) depict the two slightly different setups when holes appear. In this simplified 
scenario the focal plane, the image plane and the background are parallel, and the cameras' fields of 
view are assumed to be 180°. The object to record (depicted by the thick black line in the middle) is 
also parallel to the planes. The grey regions in Fig. 1(a) and Fig. 1(b) show the part of the scene that 
is not seen by either of the cameras $L$ and $R$. In other words, these are the occluded regions. As long 
as there is no object to record in these regions, no information will be lost because of the occlusion. 
However, as the background also has an occluded part, a hole will appear as the projection of these 
occluded parts (which are highlighted in red on the background planes) on the image plane. These 
holes are denoted by $h'$ and $h''$, respectively, highlighted in red on the image planes (see Fig. 1(a) 
and Fig. 1(b)). 

Note that in this example we only consider the case when $S$ is to the right of $R$. We do not 
lose generality with this assumption, as $S$ being to the left of $L$ would result in the same 
derivation. Note that there is also a third possible setup, namely when the left ray of camera $L$ and 
the right ray of camera $R$ intersect each other before the background. In this case, however, 
the background has no occluded part and therefore no hole will appear on the image plane. We will 
return to this case later on, after having investigated the cases when holes appear. 

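$h'$ and $h''$ can be calculated with the help of coordinate geometry. By attaching coordinates to the 
points defining the projection lines of the cameras, $h'$ and $h''$ take the forms given in Eqs. (3) and (4) 
(note that $h', h'' > 0$ when holes appear): 

[Eqs. (3) and (4): expressions for $h'$ and $h''$, illegible in the source] 

Based on $h'$ and $h''$, we can already formulate the general equation for $h$, namely 

$$h = \begin{cases} \min(h', h'') & \text{if } h'' > 0 \\ 0 & \text{if } h'' < 0 \end{cases} \tag{5}$$

By introducing 

[Eq. (6): definition of $\alpha$, illegible in the source] 

$h$ can be rewritten as 

$$h = \begin{cases} \min(\alpha s,\ l' - \alpha b) & \text{if } h'' > 0 \\ 0 & \text{if } h'' < 0 \end{cases} \tag{7}$$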
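
One way to see where the two terms inside the minimum come from is a similar-triangles argument. The following is only a sketch under assumptions that are not stated explicitly in the text: a pinhole model, cameras on the line $z = 0$ at horizontal positions $x_L$, $x_R = x_L + b$ and $x_S = x_R + s$, an object spanning $[x_0, x_0 + l]$ at depth $Z_o$, and a background at depth $Z_b$. The symbols $x_0$, $c$, $w_{LR}$ and $w_S$ are introduced here for illustration.

```latex
% Shadow cast on the background by the object, seen from a camera at position c:
\[
  \Big[\, \tfrac{Z_b}{Z_o}\,x_0 - \big(\tfrac{Z_b}{Z_o} - 1\big)\,c,\;\;
          \tfrac{Z_b}{Z_o}\,(x_0 + l) - \big(\tfrac{Z_b}{Z_o} - 1\big)\,c \,\Big]
\]
% Background hidden from both L and R: the overlap of their two shadows, of width
\[
  w_{LR} = \tfrac{Z_b}{Z_o}\,l - \big(\tfrac{Z_b}{Z_o} - 1\big)\,b .
\]
% Moving from R to S shifts the shadow, uncovering at most a strip of width
\[
  w_{S} = \big(\tfrac{Z_b}{Z_o} - 1\big)\,s .
\]
% Projection onto the image plane scales lengths at depth Z_b by f / Z_b:
\[
  h' = \alpha s, \qquad h'' = l' - \alpha b, \qquad
  \alpha = f\Big(\frac{1}{Z_o} - \frac{1}{Z_b}\Big), \qquad l' = \frac{f\,l}{Z_o}.
\]
```

Under these assumptions the hole is the part of the doubly occluded background strip that $S$ can see, which gives $\min(\alpha s,\ l' - \alpha b)$, and no hole remains once $l' - \alpha b$ becomes negative.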

After this derivation, the question is what effect the above has on the privacy of the collaborating 
photographers. In other words, we are interested in the extent to which $s$ and $b$ can be inferred 
from Eq. (7). The knowledge of $s$ and $b$ together would reveal the photographers directly because, 
obviously, the virtual coordinates of the synthesized record's focal point are known (from 
the synthesized image itself) and $s$ and $b$ would give the locations of $L$ and $R$. Naturally, this would 
mean that the privacy (or more precisely, anonymity) of the photographers is zero. If either $s$ or $b$ 
is unknown, then the anonymity of the photographers is higher. If both $s$ and $b$ are unknown, then 
the anonymity is 1. In the following, we will quantify the anonymity of the photographers more 
precisely based on information leaked by Eq. (7). 

For the further analysis, we assume that $h$ and $l'$ are known. The magnification of both $h$ and $l'$ is 
measurable on the synthesized picture, therefore the knowledge of their original value depends only 
on the knowledge of the CCD/CMOS size of the original cameras (as $h$ and $l'$ are measured on the 
image plane, which is the CCD/CMOS in this case). There are only a few different CCD/CMOS 
sizes; the typical CCD/CMOS size for professional cameras is 36x24 mm. We note, however, that we 
do not know whether the value of $h$ is the representation of $h'$ or $h''$. We further assume that $l$ and 
$f$ can be guessed by the observer. $l$ is the length of the object which, in most cases, has well-known 
dimensions. For example, if a speaker is recorded then $l$ is about 45-55 cm (measured at the 
shoulders), depending on the gender. The value of $f$ (i.e., the focal length) is guessable by knowing 
that specific scenes require specific $f$ values. Still considering the example with the speaker, the 
optimal focal length for capturing him/her would be between 85 and 100 mm, as this is the focal 
length interval most suitable for (full-length) portraits. Finally, we also assume that $Z_o$ and $Z_b$ are 
guessable as well. The guessability of the latter two parameters is an implication of the previous 
assumptions, namely of the guessability of $l$, $l'$ and $f$. We note, however, that $Z_b$ can only be guessed 
if the background is not textureless. 

Fig. 1. The two different setups that result in holes on the synthesized picture: (a) first setup giving $h'$; (b) second setup giving $h''$. 

Now, from Eq. (7) we know that 

$$\text{i) } h = \alpha s \qquad \text{ii) } h = l' - \alpha b, \tag{8}$$

but we do not know which of the two cases holds. If the first case prevails then 

$$s = \frac{h}{\alpha}, \tag{9}$$

$$b < \frac{l' - h}{\alpha}, \tag{10}$$

which means that the observer knows the position of $R$ precisely and has an upper bound for $b$ (i.e., 
for the distance between $L$ and $R$). Otherwise, if the second case prevails then 

$$s > \frac{h}{\alpha}, \tag{11}$$

$$b = \frac{l' - h}{\alpha}, \tag{12}$$

meaning that we have a lower bound on the distance of $R$ from $S$ and we know $b$ precisely. 

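
The two readings of a measured hole can be tabulated with a few lines of code. This is a sketch only: the function name and the numeric value of `alpha` in the usage line are hypothetical, and `alpha` is simply passed in rather than derived from the scene geometry.

```python
def photographer_bounds(h, l_proj, alpha):
    """What a measured hole size h tells the observer about s and b.

    h      : hole size on the image plane
    l_proj : size of the object's projection on the image plane (l')
    alpha  : the factor relating s and b to lengths on the image plane
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    s_value = h / alpha
    b_value = (l_proj - h) / alpha
    return {
        # First case (h = alpha * s): s is known exactly, b only bounded above.
        "case_i": {"s_equals": s_value, "b_below": b_value},
        # Second case (h = l' - alpha * b): b is known exactly, s bounded below.
        "case_ii": {"s_above": s_value, "b_equals": b_value},
    }


# Hypothetical numbers, all lengths in metres.
bounds = photographer_bounds(h=0.001, l_proj=0.005, alpha=0.002)
```

Since the observer cannot tell which case holds, both rows have to be kept when building the set of suspects.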
In the following, we assume that $n$ photographers are positioned with their cameras along a section 
of the focal plane. We refer to the two ends of this section as $P$ and $Q$, left to right, with coordinates 
$(x_P, z_P)$ and $(x_Q, z_Q)$, respectively. Note that the worst case from the point of view of an observer 
aiming at de-anonymization is when $s$ does not restrict the anonymity set of the photographers, 
i.e., when the knowledge of $s$ does not exclude any of the suspected photographers. This happens 
if, in the first case, $x_S - s > x_P + b$, and if, in the second case, $x_S - x_Q > s$. In the further analysis 
we will assume this worst case, i.e., the results at the end will be conservative from the observer's 
point of view. 

By not assuming a single mandatory position for the cameras relative to the photographer's body 
(i.e., the camera is not necessarily located at the centerline of the torso), there are [expression illegible in the source] possible pairs 
of journalists who are suspected of being the original recorders in the first case. This is because, 
in the first case, $R$ is known and $L$ is within distance $b$ of $R$. If we consider that the journalists are 
standing shoulder-to-shoulder with width $l$, the above statement becomes clear. In the second case, 
there are [expression illegible in the source] possible pairs of suspicious journalists. 
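We can now quantify the anonymity of the photographers following [3] as 

$$A = \frac{H(X)}{H_{\max}}, \tag{13}$$

where $H$ stands for entropy and $H_{\max}$ for its maximum. In our case, i.e., when all the pairs within the anonymity set are equally 
suspicious, the latter equation can be simplified as 

$$A = \frac{-\sum_{i=1}^{N_{susp}} \frac{1}{N_{susp}} \log \frac{1}{N_{susp}}}{-\sum_{i=1}^{N} \frac{1}{N} \log \frac{1}{N}} = \frac{\log N_{susp}}{\log N} = \log_N N_{susp}, \tag{14}$$

where $N = \frac{n(n-1)}{2}$ is the number of possible pairs of photographers. 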
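
For equally suspicious pairs, the anonymity measure reduces to a ratio of logarithms, which can be sketched as below. The function names are illustrative, and the code assumes that the total number of pairs is $n(n-1)/2$.

```python
import math


def total_pairs(n):
    """Number of unordered pairs among n photographers."""
    return n * (n - 1) // 2


def anonymity(n_susp, n):
    """Normalized entropy of a uniform distribution over n_susp suspicious
    pairs, relative to a uniform distribution over all pairs."""
    all_pairs = total_pairs(n)
    if all_pairs < 2:
        raise ValueError("need at least three photographers")
    if not 1 <= n_susp <= all_pairs:
        raise ValueError("n_susp must lie between 1 and the number of pairs")
    return math.log(n_susp) / math.log(all_pairs)


fully_identified = anonymity(1, 20)    # a single suspicious pair
fully_anonymous = anonymity(190, 20)   # every pair is equally suspicious
```

The two extreme inputs reproduce the end points of the scale: anonymity 0 when a single pair remains and anonymity 1 when no pair can be excluded.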



When calculating anonymity, the two cases in Eq. (8) have to be considered together but without 
counting their intersection twice. Therefore, having $n$ photographers results in 

$$N_{susp}^{(h>0)} = \text{[expression illegible in the source]} \tag{15}$$

suspicious pairs (in other words, the size of the anonymity set is $N_{susp}^{(h>0)}$) in case when $h > 0$, 
where $b$ can be calculated as 

$$b = \frac{l' - h}{\alpha}. \tag{16}$$

As the components of these equations are known or guessable, the anonymity of the photographers 
can be evaluated using Eq. (15) in case $h > 0$. 

In fact, even the non-existence of holes reveals private information. In order to mitigate holes on 
the synthesized picture, L and R must reside on one of the specific constellations resulting in no 
occlusion; and the number of such constellations is geometrically limited. Such constellations are 
characterized by the fact that the left ray of camera L and the right ray of camera R have to intersect 
each other before the background plane in order to avoid meaningful occlusions. In such cases h" 
can be negative. 

In the latter case, i.e., when $h = 0$, either $h' < 0$ or $h'' < 0$. Since $h'$ is always positive, the 
latter must hold, which implies that 

$$b > \frac{l'}{\alpha}, \tag{17}$$

which is the straightforward opposite of the previous cases when $h''$ was positive. Relying on the 
previous calculations, the number of suspicious pairs in this case is as follows: 



$$N_{susp}^{(h=0)} = \text{[summation illegible in the source]} \tag{18}$$

The summation in Eq. (18) is justified by the observation that the $h = 0$ case is similar to the above 
second case when $b$ was known and $s$ did not convey information in the worst case to the observer. 
Eq. (17) does not reveal any hint on $s$, and it can be rewritten as a sum of specific $b$ values, expressed 
in a discrete way with the help of $k$ when formulating anonymity. Finally, by simplifying Eq. (18), 
one gets the following expression for the number of suspicious pairs when $h = 0$: 

$$N_{susp}^{(h=0)} = \text{[expression illegible in the source]} \tag{19}$$

where $b$ can be calculated with the help of Eq. (16) considering that $h = 0$. Anonymity can be 
further evaluated using Eq. (14). 

To give an example, let us assume a press conference scenario with a speaker and n journalists. The 
speaker is facing the journalists who are aligned beside each other, two of them stealthily recording 
the speech. Later these two will collaboratively establish a synthesized, extrapolated record with 
or without holes beside the speaker depending on the geometrical setup. In this scenario, one can 
calculate the anonymity of the photographers using Eqs. (14), (15) and (19). Some typical values 
of parameters required for the evaluation could be as follows: 

- $n = 20$ (number of journalists) 

- $l = 0.5$ m (length of object, in this case shoulder width) 

- $l' = 5$ mm (length of projection of the object on the CCD/CMOS) 

- $Z_o = 5$ m (distance between the focal plane (where the journalists are standing) and the 
speaker) 

- $Z_b = 7$ m (distance between the focal plane and the background) 

- $h = 1$ mm (size of the hole beside the speaker measured on the CCD/CMOS if there is any, 
otherwise $h = 0$) 

With these values, the anonymity of the photographers is 

$$A = \begin{cases} 0.688 & \text{if } h > 0 \\ 0.958 & \text{if } h = 0 \end{cases}$$

This tells us that the existence of holes reveals a large amount of position information (the anonymity 
of the photographers is reduced from 1 to 0.688), but even the non-existence of holes leaks 
some private information. According to [3], anonymity is considered to be preserved when, roughly, 
$A > 0.8$. Based on this, we can conclude that in case holes appear on the synthesized picture, the 
level of anonymity of the photographers is not adequate. 
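
To get a feeling for what these anonymity values mean in terms of suspects, one can invert the logarithmic measure. The sketch below assumes that the total number of pairs is $n(n-1)/2$ and that all suspicious pairs are equally likely. The resulting pair counts are back-computed here from the reported anonymity values and are not figures stated in the text.

```python
def implied_suspicious_pairs(anonymity_value, n):
    """Invert A = log_N(N_susp): the number of equally likely suspicious
    pairs that corresponds to a given anonymity value."""
    all_pairs = n * (n - 1) / 2
    return all_pairs ** anonymity_value


def is_preserved(anonymity_value, threshold=0.8):
    """Rough acceptability test, using the 0.8 rule of thumb quoted above."""
    return anonymity_value > threshold


with_holes = implied_suspicious_pairs(0.688, 20)     # roughly 37 of 190 pairs
without_holes = implied_suspicious_pairs(0.958, 20)  # roughly 152 of 190 pairs
```

Read this way, a visible hole narrows the search from 190 candidate pairs to fewer than forty, while the absence of a hole still excludes a few dozen pairs.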




Fig. 2. View synthesis setup in the case of interpolation 



3.2 Privacy Leakage on Interpolated Records 

In the same way as in the case of extrapolated view synthesis, we assume that the observer is given 
an interpolated picture $I_S$. The difference this time is that $x_S \in [x_L, x_R]$. The observer tries to 
recover the position of the original cameras $L$ and $R$ from the information contained in the holes on 
$I_S$. We consider the simplified setup depicted in Fig. 2 with the same assumptions on the cameras 
as in Section 3.1. We model the object to record with the profile depicted by the thick black line, 
and defined by the function $z = \mathcal{F}(x)$, where the $x$ axis is chosen for simplicity to overlap with the 
background, and the origin $o = (0, 0)$ of the $x$ and $z$ axes approximately coincides with the center 
of the support of $\mathcal{F}$, i.e., $\mathcal{F} : x \in [-X_\mathcal{F}, X_\mathcal{F}] \to \mathbb{R}$. 

The occlusions causing errors on the disparity map $D$, or in the worst case causing holes, are shown 
on Fig. 2 as $H_L$ and $H_R$. $H_L$ corresponds to the occluded region on the object relative to the left 
camera, while $H_R$ is the one relative to the right camera. To find $H_L$, for instance, we need to 
find the internal tangent line on $\mathcal{F}$ that intersects with $L$. For simplicity, we assume that this line 
intersects $\mathcal{F}$ at a single point $P_L = (x_{P_L}, z_{P_L})$. Thus, we may define the line $(LP_L)$ as: 

$$(LP_L) : z = \mathcal{F}'(x_{P_L}) \cdot (x - x_{P_L}) + \mathcal{F}(x_{P_L}) \quad \text{and} \quad \mathcal{F} \cap (LP_L) = \{P_L\}. \tag{21}$$

We use polar coordinates $(t_L, r_L)$ to define the point $P_L$ such that: 

$$x_{P_L} = r_L \cdot \cos(t_L) \quad \text{and} \quad z_{P_L} = r_L \cdot \sin(t_L). \tag{22}$$

We approximate the expression of $\mathcal{F}$ around $P_L$ by a circle centered at the origin $o$, and with a 
known radius $r_L = r$. Finding $H_L$ becomes equivalent to finding $t_L$. Given $Z_o$, the distance of 
the object plane from the cameras, we replace the coordinates of $L = (x_L, Z_o)$ in the equation of 
$(LP_L)$, and solve for $\sin(t_L)$. We find: 

$$\sin(t_L) = \frac{2 r Z_o \pm x_L \sqrt{4 Z_o^2 + x_L^2 - r^2}}{4 Z_o^2 + x_L^2}. \tag{23}$$

Therefore: 

$$H_L = X_\mathcal{F} - r\,|\cos t_L|. \tag{24}$$

Footnote: The same approach applies to $H_R$. 

Similarly for the right occlusions, we define $t_R$ using Eq. (23), with $x_L$ replaced by $x_R$. We find: 

$$H_R = X_\mathcal{F} - r\,|\cos t_R|. \tag{25}$$



The resulting $H_R$ and $H_L$ correspond to the real sizes of the occlusion regions on the object. The 
corresponding hole sizes on the image planes are: 

$$h_L = \frac{f}{Z_o} \cdot (X_\mathcal{F} - r \cos t_L) \quad \text{and} \quad h_R = \frac{f}{Z_o} \cdot (X_\mathcal{F} - r \cos t_R). \tag{26}$$
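
The step from an occlusion size on the object to a hole size on the sensor can be sketched numerically. The function names are illustrative, and the scaling factor $f / Z_o$ (pinhole magnification of an object at distance $Z_o$) is an assumption made for this sketch.

```python
import math


def occlusion_size(half_width, radius, angle):
    """Occluded length on the object for a tangent point at polar angle
    `angle`, with the profile locally approximated by a circle of `radius`."""
    return half_width - radius * abs(math.cos(angle))


def projected_hole(occlusion, focal_length, object_distance):
    """Hole size on the image plane, assuming the pinhole scaling f / Z_o."""
    return focal_length / object_distance * occlusion


# Hypothetical numbers: 1 m half-width, tangent point at 60 degrees,
# 50 mm focal length, object 5 m away (all lengths in metres).
occluded = occlusion_size(1.0, 1.0, math.pi / 3)
hole = projected_hole(occluded, 0.05, 5.0)
```

A tangent point near the edge of the profile (angle close to 0) leaves almost nothing occluded, whereas a tangent point near the top of the profile hides close to the full half-width.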



To evaluate the privacy leakage, one needs to relate the size of the holes on $I_S$ to $x_L$ and $x_R$. We 
assume that these holes/errors, which originate from errors on $D$, do not get further altered by [...]. 
Based on the position of these holes on $I_S$, an observer can guess three distinct cases: 

1. $(x_R \times x_L) < 0$: The two cameras are on different sides of the object. The total hole size is 
consequently $h = (h_L + h_R)$. In this case, $x_L$ and $x_R$ are defined precisely from $I_L$ and $I_R$, 
respectively. Indeed, by plugging $(x_L, Z_o)$ and $(x_R, Z_o)$ in the line equations of $(LP_L)$ and 
$(RP_R)$, we find: $x_L = 2 \cdot r_L \sin(t_L) - Z_o \tan(t_L)$, and $x_R = 2 \cdot r_R \sin(t_R) - Z_o \tan(t_R)$. 
This means that the positions of the photographers are known, i.e., the number of suspicious pairs 
of positions $(L, R)$ is $N_{susp} = 1$, and consequently from Eq. (14), anonymity $A = 0$. 

2. $(x_R \times x_L) > 0$ and $0 < x_L < x_R$: The two cameras are on the right side of the object. In 
this case $h = \max(h_L, h_R) = h_R$. This means that only the right camera position $R$ can be 
known precisely, and for the considered profile $\mathcal{F}$, we have 

$$N_{susp} = \frac{x_R}{l} = \frac{1}{l}\left(2 \cdot r_R \sin(t_R) - Z_o \tan(t_R)\right). \tag{27}$$

3. $(x_R \times x_L) > 0$ and $x_L < x_R < 0$: The two cameras are on the left side of the object. In 
this case $h = \max(h_L, h_R) = h_L$. This time only the left camera position $L$ can be found 
precisely, and similarly to Eq. (27), for the considered profile $\mathcal{F}$, we find 

$$N_{susp} = \frac{x_L}{l} = \frac{1}{l}\left(2 \cdot r_L \sin(t_L) - Z_o \tan(t_L)\right). \tag{28}$$
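
The three cases can be summarized as a small decision function on the signs of the camera coordinates, with the object centered at $x = 0$. This is a sketch for illustration; the function name and the returned dictionary layout are not part of the analysis above.

```python
def classify_interpolation_case(x_l, x_r):
    """Return which of the three cases applies and which camera positions
    the observer can then locate precisely."""
    if x_l >= x_r:
        raise ValueError("expected x_l < x_r")
    product = x_l * x_r
    if product < 0:
        # Cameras on opposite sides of the object: both positions leak.
        return {"case": 1, "located": ("L", "R")}
    if product > 0 and x_l > 0:
        # Both cameras to the right of the object: only R leaks.
        return {"case": 2, "located": ("R",)}
    if product > 0 and x_r < 0:
        # Both cameras to the left of the object: only L leaks.
        return {"case": 3, "located": ("L",)}
    raise ValueError("a camera exactly at x = 0 is not covered by the three cases")


opposite_sides = classify_interpolation_case(-1.0, 2.0)
both_right = classify_interpolation_case(1.0, 2.0)
both_left = classify_interpolation_case(-3.0, -1.0)
```

Only the first case pins down both photographers at once. In the other two cases one camera position remains hidden, so the size of the anonymity set depends on how many positions are compatible with the hidden camera.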



3.3 Conclusion 

As social media coverage increases and exceeds the coverage of traditional media, privacy is an 
increasingly important problem. In this paper, we have highlighted the threat of camera location 
detection attacks mounted by an adversary that combines location clues from published photographs 
with footage from adversary operated cameras in the vicinity of the photographer. In a world that is increasingly 
getting saturated with cameras, this is an important privacy problem. 

Preliminary investigations on analyzing current view synthesis algorithms indicate decent anonymity 
gains for modest computational effort. This indicates a fruitful line of enquiry in developing defense 
techniques against camera location detection attacks and, in turn, defending against the larger class of 
photographer de-anonymization attacks. 